Weakly-Supervised Violence Detection in Movies with Audio and Video Based Co-training
Abstract
In this work, we present a novel method for detecting violent shots in movies. The detection process is split into two views: audio and video. In the audio view, a weakly-supervised method is used to improve classification performance. In the video view, a classifier detects violent shots. Finally, the audio and video classifiers are combined through co-training. Preliminary experimental results on several movies with violent content show the effectiveness of our method.
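The co-training idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the features, the classifiers, or the rule for selecting confident shots, so everything below is an assumption. A nearest-centroid classifier stands in for both the audio and the video classifier, the confidence score is a simple distance margin, and the names `centroid_fit`, `centroid_predict`, and `co_train` are hypothetical. The weakly-supervised audio method mentioned in the abstract is not reproduced here.

```python
# Minimal co-training sketch over two views (audio, video) of the same shots.
# Assumption: nearest-centroid classifiers stand in for the paper's classifiers.

def centroid_fit(X, y):
    """Fit a nearest-centroid model; returns {label: centroid vector}."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def centroid_predict(model, x):
    """Return (label, margin); margin is the gap between the two nearest centroids."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)), label)
        for label, c in model.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(audio, video, labels, rounds=5, per_round=2):
    """Alternate between views; each view labels its most confident shots.

    labels holds 1 (violent), 0 (non-violent), or None (unlabeled).
    Newly labeled shots join a shared pool that the other view trains on next.
    """
    labels = list(labels)
    views = (audio, video)
    for _ in range(rounds):
        for view in views:
            unlabeled = [i for i, l in enumerate(labels) if l is None]
            if not unlabeled:
                break
            known = [i for i, l in enumerate(labels) if l is not None]
            model = centroid_fit([view[i] for i in known],
                                 [labels[i] for i in known])
            scored = [(centroid_predict(model, view[i]), i) for i in unlabeled]
            scored.sort(key=lambda item: -item[0][1])  # most confident first
            for (label, _margin), i in scored[:per_round]:
                labels[i] = label
    # Refit one model per view on the final labeled pool.
    known = [i for i, l in enumerate(labels) if l is not None]
    models = tuple(
        centroid_fit([view[i] for i in known], [labels[i] for i in known])
        for view in views
    )
    return models[0], models[1], labels


# Toy data (made up for illustration): one feature per view, two seed labels.
audio = [[0.9], [0.1], [0.8], [0.2], [1.0], [0.0], [0.7], [0.3]]
video = [[1.0], [0.0], [0.7], [0.3], [0.9], [0.1], [0.8], [0.2]]
seed_labels = [1, 0, None, None, None, None, None, None]
audio_model, video_model, shot_labels = co_train(audio, video, seed_labels)
```

In this sketch both views share one labeled pool, so a shot that the audio view labels confidently becomes training data for the video view on the next pass, and vice versa. A real system would likely use richer features and a stricter confidence threshold to limit label noise.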
Related resources
Multimodal and ontology-based fusion approaches of audio and visual processing for violence detection in movies
In this paper we present our research results towards the detection of violent scenes in movies, employing advanced fusion methodologies, based on learning, knowledge representation and reasoning. Towards this goal, a multi-step approach is followed: initially, automated audio and visual analysis is performed to extract audio and visual cues. Then, two different fusion approaches are deployed: ...
DAI Lab at MediaEval 2012 Affect Task: The Detection of Violent Scenes using Affective Features
We propose an approach to detect violence in movies at video shot level using low-level and mid-level features. We use audio energy, pitch and Mel-Frequency Cepstral Coefficients (MFCC) features to represent the affective audio content of movies. For the affective visual content, we extract average motion information. To learn a model for violence detection, we choose a discriminative classific...
Affective Video Retrieval: Violence Detection in Hollywood Movies by Large-Scale Segmental Feature Extraction
Without doubt general video and sound, as found in large multimedia archives, carry emotional information. Thus, audio and video retrieval by certain emotional categories or dimensions could play a central role for tomorrow's intelligent systems, enabling search for movies with a particular mood, computer aided scene and sound design in order to elicit certain emotions in the audience, etc. Yet...
ICIP-98 Audio-Visual Content-Based Violent Scene Characterization
We present a novel technique to characterize and index violent scenes in general TV drama and movies. Our goal is to identify violent signatures and localize violent events within a movie to support "high-level" video indexing. In particular, we exploit multiple "audiovisual" signatures to create a perceptual relation for conceptually meaningful violent scene identification. Potential applicatio...
Multiple Instance Deep Learning for Weakly Supervised Small-Footprint Audio Event Detection
State-of-the-art audio event detection (AED) systems rely on supervised learning using strongly labeled data. However, this dependence severely limits scalability to large-scale datasets where fine resolution annotations are too expensive to obtain. In this paper, we propose a multiple instance learning (MIL) framework for multi-class AED using weakly annotated labels. The proposed MIL framewor...